ABSTRACT

We present results from the first twelve months of operation of Radio Galaxy Zoo, 
which upon completion will enable visual inspection of over 170,000 radio sources to 
determine the host galaxy of the radio emission and the radio morphology. Radio 
Galaxy Zoo uses 1.4 GHz radio images from both the Faint Images of the Radio Sky
at Twenty Centimeters (FIRST) and the Australia Telescope Large Area Survey (ATLAS)
in combination with mid-infrared images at 3.4 μm from the Wide-field Infrared
Survey Explorer (WISE) and at 3.6 μm from the Spitzer Space Telescope. We present
the early analysis of the WISE mid-infrared colours of the host galaxies. For images 
in which there is > 75% consensus among the Radio Galaxy Zoo cross-identifications, 
the project participants are as effective as the science experts at identifying the host 
galaxies. The majority of the identified host galaxies reside in the mid-infrared colour 
space dominated by elliptical galaxies, quasi-stellar objects (QSOs), and luminous
infrared radio galaxies (LIRGs). We also find a distinct population of Radio Galaxy Zoo
host galaxies residing in a redder mid-infrared colour space consisting of star-forming 
galaxies and/or dust-enhanced non star-forming galaxies consistent with a scenario 
of merger-driven active galactic nuclei (AGN) formation. The completion of the full 
Radio Galaxy Zoo project will measure the relative populations of these hosts as a 
function of radio morphology and power while providing an avenue for the identification
of rare and extreme radio structures. Currently, we are investigating candidates
for radio galaxies with extreme morphologies, such as giant radio galaxies, late-type 
host galaxies with extended radio emission, and hybrid morphology radio sources. 

Key words: methods: data analysis - radio continuum: galaxies - infrared: galaxies.


1 INTRODUCTION 

Large radio continuum surveys over the past 60 years have played a key role in our understanding of the evolution of galaxies across cosmic time. These surveys are typically limited to flux densities of $S_{1.4} > 1$ mJy at 1.4 GHz (21 cm), and are consequently dominated by active galactic nuclei (AGN) with 1.4 GHz luminosities of $L_{1.4} > 10^{23}$ W Hz$^{-1}$ (e.g. Mauch & Sadler 2007; Mao et al. 2012). The largest such survey, the NRAO VLA Sky Survey (NVSS; Condon et al. 1998), is relatively shallow with a completeness level of 50% at 2.5 mJy beam$^{-1}$ and 99% at 3.4 mJy beam$^{-1}$. For surveys sensitive to flux densities below 1 mJy, the radio emission is a combination of: (1) low-luminosity AGN ($L_{1.4} < 10^{22}$ W Hz$^{-1}$; e.g. Slee et al. 1994); and (2) star formation (e.g. Condon et al. 2012). Current deep ($S_{1.4} < 15$ μJy beam$^{-1}$) radio continuum surveys are limited to < 10 square degrees of the sky (e.g. Owen & Morrison 2008; Smolčić et al. 2009; Condon et al. 2012; Franzen et al. 2015) as a result of the limited observing time available.

Over the next 5 to 10 years, next-generation radio telescopes and telescope upgrades such as the Australian SKA Pathfinder (ASKAP; Johnston et al. 2007), MeerKAT (Jonas 2009) and Apertif (Verheijen et al. 2008) will perform surveys with higher angular resolution and sensitivity that cover wider fields. In particular, wide-area surveys such as the Evolutionary Map of the Universe survey (EMU; Norris et al. 2011) using ASKAP, and the WODAN survey (Röttgering et al. 2011) using the Apertif upgrade on the Westerbork Synthesis Radio Telescope (WSRT), will together provide all-sky coverage to an rms sensitivity of ≈ 10-20 μJy beam$^{-1}$ and cover a large spectral range at < 15 arcsec resolution. The combination of EMU and WODAN is expected to detect over 100 million radio sources, compared to the total of ≈ 2.5 million radio sources currently known.

These wide-field surveys will be complemented by deeper field studies over smaller sky areas from facilities such as MeerKAT in the Southern Hemisphere. Currently, it is planned that the MeerKAT MIGHTEE survey (e.g. Jarvis 2012) will reach ~1.0 μJy beam$^{-1}$ rms over 35 square degrees of the best-studied extragalactic deep fields accessible from South Africa. Together, these surveys will provide an unprecedented view of activity in the Universe, addressing many key science questions on the evolution of AGN and star formation in galaxies as well as the cosmic large-scale structure.

To harvest new scientific knowledge from these very large surveys, the detected radio sources need to be cross-identified with galaxies observed at other wavelengths. The task of cross-matching a radio source with its host galaxy is complicated by the large and complex radio source structures that are often found in radio-loud AGN. For survey samples of several thousand sources, radio cross-identifications have traditionally been performed through visual inspection (e.g. Norris et al. 2006; Middelberg et al. 2008; Gendre et al. 2010; Lin et al. 2010). Automated radio classification algorithms are still in their infancy; Norris et al. (2011) estimated that approximately 10% of the 70 million radio sources expected from the EMU survey will be too complicated for current automated algorithms (e.g. Proctor 2006; Kimball & Ivezić 2008; van Velzen et al. 2015; Fan et al. 2015). Importantly, these complex sources are also likely to be those with the greatest scientific potential.

To test possible solutions to this cross-identification issue, we have created Radio Galaxy Zoo$^1$, an online citizen science project based upon the concepts of the original Galaxy Zoo (Lintott et al. 2008). Following its launch in 2007, the success of the Galaxy Zoo project inspired the creation of the Zooniverse$^2$, now a highly successful platform for online citizen science, hosting more than 30 projects across a diverse selection of research areas (from astronomy to history and biology), and with over 1.4 million users. Zooniverse projects share a common philosophy of “real research online” with a clear research goal and a real need for human input. The first project, Galaxy Zoo, has produced over 50 peer-reviewed publications to date (for a recent summary, see e.g. Fortson et al. 2012).

In Radio Galaxy Zoo, the public is asked to cross-match radio sources, often with complex structures, to their corresponding host galaxies observed in infrared images. The importance and complexities of radio source morphologies are described in Section 2. Section 3 describes the Radio Galaxy Zoo project. Early analyses of the reliability of Radio Galaxy Zoo source cross-identifications and classifications are discussed in Section 4. Section 5 presents the science outcomes obtained from the first year of project operation. We summarise our project and early results in Section 6. Throughout this paper we adopt a ΛCDM cosmology of $\Omega_{\rm m} = 0.3$, $\Omega_\Lambda = 0.7$ with a Hubble constant of $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$.
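
To illustrate how the adopted cosmology connects the flux-density limits quoted in this section to 1.4 GHz luminosities, the following minimal sketch integrates the luminosity distance for a flat ΛCDM model and converts an observed flux density to a rest-frame luminosity. The function names, the trapezoidal integration, and the spectral index α = -0.7 (with S ∝ ν^α) used for the K-correction are illustrative assumptions and are not taken from the Radio Galaxy Zoo analysis.

```python
import math

C_KM_S = 299792.458               # speed of light in km/s
MPC_IN_M = 3.0856775814913673e22  # one megaparsec in metres


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, omega_l=0.7, steps=10000):
    """Luminosity distance in Mpc for a flat LCDM cosmology.

    Uses a simple trapezoidal integration of 1/E(z); the defaults are the
    cosmological parameters adopted in the text.
    """
    if z <= 0:
        return 0.0

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        integral += inv_e(i * dz)
    integral *= dz

    comoving_mpc = (C_KM_S / h0) * integral
    return (1.0 + z) * comoving_mpc


def radio_luminosity_w_hz(flux_mjy, z, alpha=-0.7):
    """Rest-frame luminosity (W/Hz) from an observed flux density in mJy.

    alpha is an ASSUMED spectral index (S proportional to nu**alpha) used
    only for the K-correction; it is not a value from the paper.
    """
    d_l_m = luminosity_distance_mpc(z) * MPC_IN_M
    flux_si = flux_mjy * 1e-29  # 1 mJy = 1e-29 W m^-2 Hz^-1
    return 4.0 * math.pi * d_l_m ** 2 * flux_si * (1.0 + z) ** (-(1.0 + alpha))
```

Under these assumptions, `radio_luminosity_w_hz(1.0, 0.1)` gives a luminosity of a few times $10^{22}$ W Hz$^{-1}$, so a 1 mJy survey limit reaches the AGN-dominated regime ($L_{1.4} > 10^{23}$ W Hz$^{-1}$) only at somewhat higher redshifts.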


2 RADIO SOURCE MORPHOLOGIES 

In low-redshift radio sources, a combination of radio morphology and radio spectral index is useful for distinguishing whether the observed radio emission is dominated by star formation or AGN, with the presence of core-jet, double or triple radio sources providing evidence for AGN-dominated emission. On the other hand, the combination of radio and infrared observations proves to be the most effective means for differentiating between AGN- and star formation-dominated emission at higher redshifts (e.g. Seymour et al. 2008; Seymour 2009).


A key step in determining radio source physical properties is the determination of their distance through redshifts associated with the identification of the host galaxies from which the radio emission originates. The difficulty in cross-identification can be exemplified by the case of a linear alignment of three radio sources (e.g. Norris et al. 2006), which can be either: (1) a chance alignment of radio emission from three separate galaxies; (2) three radio components from a single radio-loud AGN with two extended radio lobes; or (3) the chance alignment of a double radio source and a compact radio source.

While the vast majority (~90%) of radio sources are compact in structure (Shabala et al. 2008; Sadler et al. 2014), the extended morphologies of radio-loud sources were first classified by Fanaroff & Riley (1974) based on 57 sources from the Third Cambridge (3C) Radio sample (Mackay 1971). Fanaroff & Riley (1974) separated their sample of radio galaxies according to the ratio of the distance between the regions of highest brightness on opposite sides of the host to the total source extent from one end to the opposite; a ratio below 0.5 was class I, and a ratio above 0.5 was class II, now known as the “Fanaroff-Riley” types FR-I and FR-II, respectively. They also found a sharp division in radio luminosity density between the two classes at $L_{178\,{\rm MHz}} \approx 2 \times 10^{25}$ W Hz$^{-1}$ sr$^{-1}$, with FR-II sources above and FR-I sources below this luminosity density. This classification was later confirmed by Owen & Ledlow (1994), who found that this break between FR-II and FR-I radio sources also correlates with optical luminosity. However, this correlation with the optical luminosity shows significant overlap between the two populations (e.g. Best 2009). Further investigation into FR-II and FR-I sources has produced a number of different radio source morphologies. Unusual classes of radio source morphologies include Narrow Angle Tail (NAT; Rudnick & Owen 1976), Wide Angle Tail (WAT; Owen & Rudnick 1976), and Hybrid Morphology Radio Sources (HyMoRS; Gopal-Krishna & Wiita 2000). The NAT sources are usually thought to have high peculiar velocities (e.g. Venkatesan et al. 1994), whereas WATs are mostly associated with galaxy clusters, where the intracluster medium (ICM) density and the relative motions of the cluster galaxies are responsible for the shape and structure of the observed radio sources (e.g. Owen & Rudnick 1976; Rudnick & Owen 1976; Burns 1998). The current method of determining the morphology of extended radio sources is visual inspection, and as a result it is only applicable to samples of no more than a few thousand radio sources (e.g. Middelberg et al. 2008).
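
The Fanaroff & Riley (1974) ratio criterion described in this section can be written as a short function. This is a minimal sketch: the function and argument names are hypothetical, both lengths are assumed to be measured in the same angular or physical units, and a ratio of exactly 0.5 is left unclassified because the text does not assign that boundary case to either class.

```python
def fanaroff_riley_class(peak_separation, total_extent):
    """Classify a radio source as 'FR-I' or 'FR-II' from the ratio criterion.

    peak_separation: distance between the regions of highest brightness on
                     opposite sides of the host.
    total_extent:    total source extent from one end to the other.
    Returns None when the ratio is exactly 0.5 (boundary not defined here).
    """
    if total_extent <= 0 or peak_separation < 0:
        raise ValueError("lengths must be non-negative and the extent positive")

    ratio = peak_separation / total_extent
    if ratio < 0.5:
        return "FR-I"   # brightest regions lie close to the host
    if ratio > 0.5:
        return "FR-II"  # brightest regions lie towards the outer edges
    return None
```

Such a rule only applies once the brightness peaks and source extent have been measured, which is itself the step that currently relies on visual inspection.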


In Fig. 1 we present four examples of extended radio morphologies that can be found in galaxies. Fig. 1(a) shows an example of an FR-I radio source, 3C31, from NRAO/AUI by R. Laing, A. Bridle, R. Perley, L. Feretti, G. Giovannini, and P. Parma. Fig. 1(b) shows an FR-II radio source, 3C353, with hotspots in both radio lobes as well as the narrow jet and counterjet, from NRAO/AUI by M. Swain, A. Bridle, and S. Baum. We show a radio source with more complex structures, 3C288, in Fig. 1(c), from NRAO/AUI by A. Bridle, J. Calicut and E. Fomalont. 3C288 exhibits an unusual asymmetry in its radio morphology, and edge-darkening can only be observed on one side of this double-lobed radio source. Finally, Fig. 1(d) shows 3C465, an example of a WAT source, from the Atlas of DRAGNs (Leahy et al. 1996) by F. Owen.


3 RADIO GALAXY ZOO 

Radio Galaxy Zoo is an online citizen science project where volunteers classify radio galaxies and their host galaxies via a web interface. The main purpose of Radio Galaxy Zoo is to produce cross-identifications for resolved radio sources which are too complex (i.e. where the two radio lobes are widely separated or where the radio morphology is asymmetrical or otherwise complex) for automated source matching algorithms (e.g. Becker et al. 1995; McMahon et al. 2002; Kimball & Ivezić 2008; Proctor 2011; van Velzen et al. 2015). In the current phase of the project, we are offering to the volunteers a total of 177,218 radio sources from two radio surveys described in the following subsections. To address our need for the classifications of complex radio source morphologies, we have biased our sample against unresolved sources as described in Section 3.1.1.

1 http://radio.galaxyzoo.org

2 http://www.zooniverse.org

Figure 1. Four examples of radio-loud galaxy morphologies. (a) FR-I radio source 3C31 at 1.4 GHz with the VLA from NRAO/AUI (http://images.nrao.edu/AGN/Radio_Galaxies/) by R. Laing, A. Bridle, R. Perley, L. Feretti, G. Giovannini, and P. Parma (Laing 1996). (b) 3C353 at 8.4 GHz with the VLA from NRAO/AUI by M. Swain, A. Bridle, and S. Baum (Swain et al. 1998). (c) 3C288 at 8.4 GHz with the VLA from NRAO/AUI by A. Bridle, J. Calicut, and E. Fomalont (Bridle et al. 1989). (d) A WAT radio source, 3C465, in Abell 2634 at 1.4 GHz with the VLA from the Atlas of DRAGNs (http://www.jb.man.ac.uk/atlas/object/3C465.html) by F. Owen (Eilek et al. 1984; Leahy et al. 1996).

Although initially designed as a pilot study in preparation for the 7 million complex radio sources from the upcoming EMU survey, we are currently exploring the inclusion of other radio surveys for subsequent phases of this project. In addition to being an alternative technique for processing large datasets, the results of Radio Galaxy Zoo will also provide an ideal training dataset for the development and implementation of future-generation machine-learning algorithms in the field of pattern recognition.


3.1 Data 


We extracted the radio sources for this project from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST; White et al. 1997; Becker et al. 1995) and the Australia Telescope Large Area Survey Data Release 3 (ATLAS; Franzen et al. 2015, submitted to MNRAS). We chose FIRST over NVSS (Condon et al. 1998) due to its higher resolution and greater depth, making it more comparable to ATLAS and EMU. We expect many of these radio sources to be at high redshifts, so observations of the host galaxies’ stellar components are typically derived from infrared surveys to reduce the effects of dust obscuration. In our case, we offer overlays of the FIRST and ATLAS fields on equivalent fields at mid-infrared wavelengths from the Wide-Field Infrared Survey Explorer (WISE; Wright et al. 2010) and the Spitzer Wide-Area Infrared Extragalactic Survey (SWIRE; Lonsdale et al. 2003) surveys, respectively.


3.1.1 FIRST and WISE data 


The majority of the data in Radio Galaxy Zoo comes from the 1.4 GHz FIRST survey (catalogue version 14 March 2004) and the 3.4 μm WISE survey (all-sky data release in March 2012; Cutri et al. 2013). FIRST covers over 9000 square degrees of the northern sky down to a 1σ noise level of 150 μJy beam$^{-1}$ at 5" resolution. WISE is an all-sky survey at wavelengths 3.4, 4.6, 12, and 22 μm with 5σ point source sensitivity in unconfused regions of no worse than 0.08, 0.11, 1.0, and 6.0 mJy (Wright et al. 2010). These four wavebands are also identified as W1, W2, W3 and W4 in order of increasing wavelength. The selection of these four bands makes WISE an excellent instrument for studies of stellar structure and interstellar processes of galaxies. The two shorter bands trace the stellar mass distribution in galaxies and the longer wavelengths map the warm dust emission and polycyclic aromatic hydrocarbon (PAH) emission, both tracing the current star formation activity.

We designed Radio Galaxy Zoo to cross-match complex radio sources with their host galaxy rather than simple, compact radio sources which are easily matched by algorithms. We filtered the FIRST radio catalogue based on two criteria: (1) the radio source has a signal-to-noise ratio (SNR) greater than 10; and (2) the radio source is considered to be resolved. We considered a source to be resolved if it satisfies the criterion:


$$\frac{S_{\rm peak}}{S_{\rm int}} < 1.0 - \left(\frac{0.1}{\log(S_{\rm peak})}\right) \qquad (1)$$

where $S_{\rm peak}$ is the peak flux density in mJy beam$^{-1}$ and $S_{\rm int}$ is the total flux density of the radio source in mJy. This selection criterion is indicated by the blue solid line in Fig. 2 and selects 218,228 radio sources from the FIRST catalogue. At low peak flux densities the scatter around $S_{\rm peak}/S_{\rm int} = 1$ rapidly increases due to intrinsic measurement errors on the peak and total fluxes, leading for example to unphysical situations where $S_{\rm peak} > S_{\rm int}$. The larger number of sources below the $S_{\rm peak}/S_{\rm int} = 1$ line corresponds to real extended sources. We assume that the 34,689 radio sources found in the region mirroring the relation (the green dashed line) are representative of the 34,916 (16%) compact sources that can be expected in our sample; these compact sources are useful for control purposes. At the time of publication, a random subset of 174,821 out of the 218,228 fields of 3' × 3' from the FIRST survey has been made available to Radio Galaxy Zoo participants.
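
The resolved-source selection can be sketched as a small filter. This is a hedged illustration rather than the project's actual pipeline: the function names are hypothetical, the logarithm is assumed to be base 10, the criterion is taken to be $S_{\rm peak}/S_{\rm int} < 1.0 - 0.1/\log(S_{\rm peak})$, and peak flux densities at or below 1 mJy beam$^{-1}$ are rejected because the threshold is not meaningful there. For FIRST, an SNR above 10 at a noise level of about 0.15 mJy beam$^{-1}$ implies peak flux densities above roughly 1.5 mJy beam$^{-1}$, so this restriction should not affect the selected sample.

```python
import math


def resolved_threshold(s_peak_mjy):
    """Upper limit on S_peak/S_int for a source to count as resolved.

    Assumes a base-10 logarithm and S_peak in mJy/beam.
    """
    if s_peak_mjy <= 1.0:
        raise ValueError("criterion is only meaningful for S_peak > 1 mJy/beam")
    return 1.0 - 0.1 / math.log10(s_peak_mjy)


def is_resolved(s_peak_mjy, s_int_mjy):
    """True if the peak-to-integrated flux ratio falls below the threshold."""
    if s_int_mjy <= 0:
        raise ValueError("integrated flux density must be positive")
    return (s_peak_mjy / s_int_mjy) < resolved_threshold(s_peak_mjy)


def select_resolved(catalogue):
    """Keep (s_peak, s_int) pairs that satisfy the resolved criterion."""
    return [(sp, si) for sp, si in catalogue if is_resolved(sp, si)]
```

For a source with a peak flux density of 10 mJy beam$^{-1}$ the threshold is 0.9, so it is selected only if its integrated flux density exceeds its peak value by more than about 11%; the threshold approaches 1 for brighter sources, where the measurement scatter is smaller.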



Figure 2. The distribution of peak to integrated flux density ratio as a function of peak flux density ($S_{\rm peak}$, in mJy) for all FIRST radio sources with SNR > 10. The scatter in the flux ratio above the $S_{\rm peak}/S_{\rm int} = 1$ line at low fluxes is the result of intrinsic errors on the peak and total flux density measurements. The points below the solid blue line represent the Radio Galaxy Zoo selection of extended sources. The mirror of this line around $S_{\rm peak}/S_{\rm int} = 1$ is shown by the green dashed line and demonstrates that a fraction of the selected sources will be compact. We estimate that our sample contains approximately 16% compact sources for control purposes.


3.1.2 ATLAS and SWIRE 


The 4396 radio sources drawn from ATLAS cover 6.3 square degrees, with 2.7 square degrees centred on the European Large Area ISO Survey South 1 field (ELAIS S1) and 3.6 square degrees centred on the Chandra Deep Field South (CDFS). ATLAS reaches a 1σ noise level of 16 μJy beam$^{-1}$ in ELAIS S1 and 13 μJy beam$^{-1}$ in CDFS (Franzen et al. 2015). The angular resolution of the survey varies across the two regions with a mean of 12.2" × 7.6" in ELAIS S1 and 16.8" × 6.9" in CDFS. ATLAS was chosen because the two fields are considered the pilot fields for the EMU survey and as such the resolution and sensitivity limits are comparable to EMU. The 3.6 μm images come from the SWIRE survey, which covers 6.58 square degrees centred on CDFS and 14.26 square degrees centred on ELAIS S1 at 3.6, 4.5, 5.8, and 8.0 μm down to a 5σ noise level of 7.3, 9.7, 27.5, and 32.5 μJy (Lonsdale et al. 2003). A random subset of 2,397 radio sources from ATLAS is currently offered to Radio Galaxy Zoo’s participants.


3.2 Interface description 

Radio Galaxy Zoo was launched on December 17th 2013. This international online citizen science project is available in 8 languages (English, Spanish, Russian, Chinese, Polish, French, German and Hungarian) and invites participants to match radio sources with the corresponding infrared host galaxy following a decision tree similar to the original Galaxy Zoo project (Lintott et al. 2008). While Galaxy Zoo uses colour composite images from the Sloan Digital Sky Survey (SDSS), Radio Galaxy Zoo enables the participant to transition between the mid-infrared image and the 1.4 GHz radio image via a slider.

The Radio Galaxy Zoo interface is shown in Fig. 3. The radio and infrared images are overlaid upon one another with the lowest contour and shading for the radio images pre-set at a 3σ level, as shown by the blue contours in Fig. 3(a) and (b). There is a continuum of transparency levels between the radio and infrared images with the default transparency position in the middle. When a participant transitions from the radio to the infrared image, the radio colour map will be gradually replaced by a set of contours (as shown in Fig. 3). The interface also includes a spotter’s guide containing examples of radio sources matched to infrared sources, keyboard shortcuts, a toggle function to turn on or off the radio contours, and a link to return to the tutorial.

At the beginning, each participant is introduced to the project through the completion of a simple tutorial which guides them through the necessary steps to complete the classification of a single subject. The participant is required to follow three steps to make a classification: (1) select the radio contours that the participant considers to correspond to one radio source (Fig. 3a); (2) select the infrared host galaxy that corresponds to the selected radio contours (Fig. 3b); and (3) either continue classifying the remaining radio sources or progress to the next image (Fig. 3c). For each step, the tutorial contains information on how to select the correct part of the image. After completion of the tutorial, a randomly selected image from the Radio Galaxy Zoo data set is immediately presented so that participants begin working on real data as soon as possible.

Figure 3. The Radio Galaxy Zoo interface illustrating the three steps required to make a classification. Each image is a single 3' × 3' field of view, which is designated as the “subject”. (a) Step 1: select the radio components that belong to a single radio source. (b) Step 2: select the associated infrared galaxy that corresponds to the selected radio source. (c) Step 3: either continue classifying the remaining radio sources in the image or move on to the next subject. All images were obtained from http://radio.galaxyzoo.org

Each Radio Galaxy Zoo subject is only offered once to each participant and is subsequently withdrawn from being offered once the subject reaches a given threshold of classifications; this threshold is dependent on the complexity of the source. For sources with a single and/or connected set of radio contours, the vast majority of sources are expected to have a single IR galaxy counterpart (with the exception of blended sources that may originate from separate host galaxies). We record the nearest IR source to the participants’ clicks as the host galaxy. Such an identification requires fewer independent classifications, and so these images are retired from the interface after 5 classifications. For the remainder of the images, which have multiple radio components, a higher threshold of 20 classifications is adopted for higher accuracy. The data are stored in a MongoDB database structure with each click on the image recorded for each step. We record the positions of the corners of the box surrounding the selected radio contours and the position of the selected infrared host galaxy.
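
The retirement rule and the nearest-source assignment described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, subject complexity is represented simply by the number of distinct sets of radio contours, and positions are treated as flat pixel coordinates rather than sky coordinates. It is not the project's production code.

```python
import math

# Retirement thresholds quoted in the text.
SINGLE_SOURCE_THRESHOLD = 5     # single and/or connected set of radio contours
MULTI_COMPONENT_THRESHOLD = 20  # multiple radio components


def retirement_threshold(n_contour_sets):
    """Number of classifications after which a subject is withdrawn."""
    if n_contour_sets < 1:
        raise ValueError("a subject needs at least one set of radio contours")
    if n_contour_sets == 1:
        return SINGLE_SOURCE_THRESHOLD
    return MULTI_COMPONENT_THRESHOLD


def is_retired(n_contour_sets, n_classifications):
    """True once the subject has reached its classification threshold."""
    return n_classifications >= retirement_threshold(n_contour_sets)


def nearest_ir_source(click_xy, ir_positions):
    """Index of the IR source closest to a participant's click.

    click_xy and ir_positions are (x, y) pixel coordinates (an assumption
    made for this sketch).
    """
    if not ir_positions:
        raise ValueError("no IR sources to match against")
    return min(
        range(len(ir_positions)),
        key=lambda i: math.hypot(ir_positions[i][0] - click_xy[0],
                                 ir_positions[i][1] - click_xy[1]),
    )
```

Snapping each click to the nearest catalogued IR source means that small differences in where participants click do not count as disagreements about the host galaxy.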

After completion of the classification and prior to progressing to the next radio source, particularly engaged participants can opt to discuss the subject in further detail through the RadioTalk forum. RadioTalk includes links to larger (9' × 9') FIRST and WISE images, images from NVSS (Condon et al. 1998) and optical observations from SDSS Data Release 10 (Ahn et al. 2014) and SDSS Data Release 12 (Alam et al. 2015) for further detailed investigation. There is also a discussion board used for discussions on an object and for general help on the project as a whole. This is where the interaction between the science team and the volunteers occurs and many new candidate discoveries are further investigated. In Galaxy Zoo, these forum discussions resulted in the discovery of new classes of objects such as the Voorwerpjes and “Hanny’s Voorwerp” - an ionization light echo from a faded AGN (Lintott et al. 2009; Keel et al. 2012) as well as the “green peas” - [OIII] emission line-dominated compact star-forming galaxies (Cardamone et al. 2009).


3.3 User Base 

On May 1, 2015, Radio Galaxy Zoo had over 6900 registered volunteers and 1,155,000 classifications. Each participant has the option of logging into the Zooniverse system, which benefits the Radio Galaxy Zoo project by allowing us to identify the contributions made by individuals. There are 102 participants (1.4%), each of whom has classified over 1,000 subjects, and 11 of these (0.15%) who have classified over 10,000 subjects. Fig. 4 shows the distribution of classifications of participants in the project. More than half (62%) of our project is completed by the top 1,000 volunteers (in terms of the number of subjects classified). Participants who choose not to log into the system still have their classifications recorded; in the absence of other information, we use their IP addresses as substitute IDs. Anonymous users have generated 26.8% of the total classifications to date.


4 EARLY DATA ANALYSIS 
4.1 Control Sample 

To perform a preliminary assessment of the morphological classifications currently completed by the Radio Galaxy Zoo volunteers, we use a collection of 100 images classified by 10 members of the science team (Banfield, Kapińska, Masters, Middelberg, Rudnick, Schawinski, Shabala, Simmons, Willett & Wong) as our control sample. Of the 100 subjects, 57 are selected to represent a range of complex and/or unusual morphologies seen in radio galaxies, including double and triple sources, bent and precessing jets, HyMoRS, and artifacts (see Fig. 5). The remainder consisted of randomly selected images already classified by at least 20 volunteers; many of these include emission from compact radio morphologies with a single component.

Figure 4. Cumulative distribution of the total number of classifications in Radio Galaxy Zoo as of 1 May 2015. Anonymous users who have not logged into the interface are responsible for 27% of classifications; the top 100 registered users (dashed lines) have done an additional 42% of the total, while the top 1,000 users (dot-dashed lines) are responsible for 62% of the registered classifications.

Figure 5. Example images of the different types of radio sources in our control sample (as described in Section 4.1). (a) A compact source with image processing artefacts. We included a number of compact radio sources in our control sample for the purpose of a consistency check. (b) Compact radio source. (c) Double-lobed radio source. (d) Bent morphology radio source. (e) A wide angle tail. (f) Unusual one-sided core-lobe radio source. The background image in all panels is the WISE 3.4 μm image and the contours are the FIRST image, starting at 3 times the local rms and increasing in multiples of 2. Each image is 3' × 3' in size. The background colour scheme comes from CUBEHELIX (Green 2011).

Our control sample of 100 subjects serves two purposes. The first is to compare the levels of agreement among the experts, which is critical for establishing cutoffs in the development of consensus algorithms. We establish the limit on the consensus by identifying the classes of Radio Galaxy Zoo subjects that are too complicated for a consensus to be reached by the expert science team. Classifications are separated into three categories corresponding to the vote fraction and consensus level of the classifiers:

• Class A: all or all but one of the expert classifiers agreed on the number of radio components per radio source and the location(s) of the IR counterpart;

• Class B: 2 experts disagreed on the number of radio components or IR counterparts; and

• Class C: 3 or more experts did not agree on the fundamental radio/IR morphology.
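
The three categories above can be expressed as a simple rule on the number of dissenting experts. In this sketch each expert's classification is reduced to a single hashable label (for example a tuple of component grouping and host position); that encoding, the function names, and the choice to count dissent relative to the most common answer are all assumptions made for illustration.

```python
from collections import Counter


def count_dissenters(expert_labels):
    """Number of experts whose label differs from the most common label."""
    if not expert_labels:
        raise ValueError("at least one expert classification is required")
    counts = Counter(expert_labels)
    n_plurality = counts.most_common(1)[0][1]
    return len(expert_labels) - n_plurality


def expert_consensus_class(expert_labels):
    """Map a set of expert classifications to consensus class A, B, or C."""
    n_dissent = count_dissenters(expert_labels)
    if n_dissent <= 1:
        return "A"  # all, or all but one, experts agree
    if n_dissent == 2:
        return "B"  # two experts disagree
    return "C"      # three or more experts disagree
```

With 10 experts, a 9-1 split therefore still counts as Class A, while a 7-3 split falls into Class C.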

Of the 100 images, the science team classifications had 53 in Class A, 31 in Class B, and 16 in Class C. Individual inspection of the classifications for the Class C images reduced them to a final 10 that the science team agreed would require genuine follow-up observations to distinguish between morphological categories. We assume that both Classes A and B meet thresholds for a unique classification, which we verify using joint inspection by the entire team. We thus tentatively assume that 90 per cent accuracy is the highest possible level that can be expected from group classification either from volunteers or experts. The cutoff for consensus on individual subjects can vary depending on the number of radio components and relative difficulty of the classification.

Fig. 6 shows an example where the expert members of the science team could not reach a consensus on whether this subject contains two independent sources or a single double source, and whether the IR host was visible. In cases where there is significant disagreement between experts or volunteers, these sources will be deferred to further study where Bayesian-type analyses will assign a probability that the host has been correctly identified. These Bayesian analyses will be based on: (1) the probability that the two radio sources near the center are in fact part of a double; (2) the separation of the host from the centroid of the radio emission; and (3) the luminosity and the colours of the host.

We find that the source identifications from Radio Galaxy Zoo volunteers are as likely to disagree as the experts for difficult or ‘unusual’ radio sources. Fig. 7 shows an example field where there are multiple radio components. Although there is one clear identification with the bright elliptical (SDSS J131424.68+621945.8) in a cluster of galaxies at z = 0.131, it is unclear how many individual sources there are in this image, or whether these are all detached pieces of the same radio galaxy, now being energized by turbulence or shocks in the intracluster medium.

The second goal of the control sample was to assess the accuracy of the volunteers, both by looking at their relative agreement levels and by comparing their results to the expert classifications. We measured the consensus for a subject using $C = n_{\rm consensus}/n_{\rm all}$, where $n_{\rm consensus}$ is the number of volunteers who agreed on the arrangement and host galaxy ID for every radio component in the image, and $n_{\rm all}$ is the total number of classifications for the image. We find that the mean consensus level is C = 0.67, indicating that the majority of images do have a single majority classification (without necessarily confirming whether this consensus is in fact correct). More than 75 per cent of the images in the control sample had C > 0.50, where the consensus included a majority of independent classifiers. We also found that the consensus is strongly related to the complexity of the image being classified. When there was only one radio source in the image, the mean consensus was C = 0.73, whereas for complex images with more than one radio source component, the mean consensus for the volunteers was C = 0.44.
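
The consensus measure $C = n_{\rm consensus}/n_{\rm all}$ can be computed with a simple tally. This is a minimal sketch that assumes each volunteer classification has already been reduced to one hashable label describing the full arrangement of radio components and host identifications for the image; the function name and that encoding are hypothetical, and $n_{\rm consensus}$ is taken to be the count of the most common label.

```python
from collections import Counter


def consensus_level(classifications):
    """Return (consensus_label, C) for one subject.

    classifications: list of hashable labels, one per volunteer
    classification of the image. C is the fraction of classifications
    that match the most common label.
    """
    if not classifications:
        raise ValueError("at least one classification is required")
    counts = Counter(classifications)
    label, n_consensus = counts.most_common(1)[0]
    return label, n_consensus / len(classifications)


def mean_consensus(subjects):
    """Mean of C over a list of subjects (each a list of labels)."""
    levels = [consensus_level(labels)[1] for labels in subjects]
    return sum(levels) / len(levels)
```

For example, a subject with labels `["a", "a", "a", "b"]` has C = 0.75. A high value of C indicates agreement among volunteers only; as noted above, it does not by itself confirm that the consensus answer is correct.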

While the consensus categories of Class A, B or C provide a confidence level for the classifications made by both the experts and the volunteers, a “confident” classification does not necessarily mean that the specific cross-identifications made by both the experts and the volunteers will agree. Hence, we also measure how well the volunteers agree with the experts for the 100 subject control sample. For 74 of the 100 control images, the consensus vote from the volunteers was the same as that selected by the science team (see Fig. 8). We note that the control sample was deliberately selected to have a high percentage of morphologies which are difficult to classify, and so we expect the volunteers’ performance on the full sample to significantly exceed this. The agreement of the consensus is also a strong function of the expert level of agreement. For classes A, B, and C, the consensus classification concurs with experts 83 per cent, 50 per cent, and 36 per cent of the time, respectively. Disagreements between volunteers and experts with high levels of consensus are mostly driven by the identification of the IR source, rather than the radio components. For example, 10 of 16 Class A or B subjects (as labelled by experts) with which the volunteers disagreed were due to either mispositioning of the IR counterpart or identification of a low S/N IR peak where the experts identified no source in the image. Table 1 compares the classification distributions between the experts and the volunteers.

Figure 6. FIRST J111120.5+133123: an example where there was no agreement among the expert panel on the source identification.

Figure 7. An example of an unusual radio morphology identified by the Radio Galaxy Zoo volunteers. It is unclear whether all the radio sources in this subject are discrete components of the same source or if they are indeed independent sources. In this case, the host galaxy lies beyond the field of view of this subject.


Figure 8. Consensus metrics for the 10 Radio Galaxy Zoo experts 
and volunteers for the control sample of 100 subjects with each 
point representing one radio source in our control sample. Filled 
circles show galaxies for which the consensus for both experts and 
volunteers exactly matched; open circles indicate if they disagreed 
in any way. Galaxies characterized as classes A, B, or C by the
expert science team (Section 4.1) are plotted in red, blue, and
green, respectively.


Table 1. Classification distributions of experts vs. volunteers for 
the control sample of 100 subjects in Radio Galaxy Zoo. Experts 
and volunteers agreed on the plurality classification for 74 out of 
100 galaxies; most disagreements were for cases where the experts 
are in better agreement than the volunteers or where the image 
has a complicated, Class C morphology. The plurality classification
is the classification with the most votes.


                    Volunteers
Experts          A       B       C

Agreed
A               24      14       9
B                2       6      13
C                0       0       6

Disagreed
A                2       2       2
B                0       2       8
C                0       0      10
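
The totals quoted in the text and in the caption of Table 1 can be cross-checked against the table entries. The short sketch below does this arithmetic; it assumes that rows correspond to the expert class and columns to the volunteer class, which is our reading of the table layout.

```python
# Table 1 entries; rows: expert class, columns: volunteer class (A, B, C).
# The row/column orientation is an assumption about the table layout.
agreed = {"A": [24, 14, 9], "B": [2, 6, 13], "C": [0, 0, 6]}
disagreed = {"A": [2, 2, 2], "B": [0, 2, 8], "C": [0, 0, 10]}

n_agreed = sum(sum(row) for row in agreed.values())
n_disagreed = sum(sum(row) for row in disagreed.values())
# Disagreements for subjects labelled Class A or B by the experts
n_ab_disagreed = sum(disagreed["A"]) + sum(disagreed["B"])

print(n_agreed, n_disagreed, n_agreed + n_disagreed)  # 74 26 100
print(n_ab_disagreed)  # 16
```

The sums reproduce the 74 of 100 agreements and the 16 Class A or B disagreements mentioned in the text.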


4.2 Consensus algorithms 

The underlying data reduction relies on independent classifications
by distinct users, and no individual subject is inspected
by the same user more than once. The validity of the
single classification assumption is straightforward to verify
for the 883,494 classifications (73.2 per cent) that come from
volunteers who are logged in to the Radio Galaxy Zoo interface.
For the remaining classifications by volunteers who did
not establish logins, it is possible that some small fraction 
may have seen the subject more than once. Such duplicates 
are removed from the final catalogue. 

For the classifications of the radio emission for each
subject, determining agreement between the participants
is straightforward because the sets of contours are pre-identified.
The participant has the option of picking only
from within this limited set, although there are additional
variables depending on which counterpart galaxies they associate
with the radio emission, and whether multiple radio
sources are considered as belonging to the same host
galaxy or to separate sources. Consensus is first measured
by taking the plurality vote (over all participants) for the
unique combination of radio components assigned to different
sources in the image. The plurality vote is the option
with the highest number of total votes. For very complex
subjects, it should be noted that the plurality vote may not
be the option selected by the majority of the participants.

The host galaxy counterpart to the radio emission is 
selected by the volunteer clicking on any point within the 
subject image (Fig. §>)■ Determining consensus in this case 
is more challenging, since the fields may be crowded. Source 
densities of detections in the WISE all-sky catalogue range
from $\approx (1-2) \times 10^{4}$ deg$^{-2}$, corresponding to 25-50 sources
per 3' × 3' Radio Galaxy Zoo image. Using the locations of
all clicks within the image, we use a kernel-density estimator
(KDE) to identify the host galaxy proposed by the participants
via the clustering of their click positions, which may
differ by a few pixels but are likely to identify the same host
galaxy (see Fig. 9). Finally, we apply a local maximum filter
to determine the number and location of detected peaks in
the image, with the highest peak assigned as the location
of the IR host. The only exception to this is if the plurality
vote identified the radio lobes as having no visible IR
counterpart; in that case, the KDE result is ignored and
the consensus is assigned to “No IR counterpart”. In order
to record the participants’ clicks to a greater precision, the
pixel scale of the RGZ subjects (as presented in Fig. 9) is
of a higher resolution than the native pixel scales from both
the FIRST and WISE images.
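
The click-clustering step described above can be sketched as follows. This is a minimal illustration rather than the project code: it assumes an unnormalised Gaussian kernel with a hypothetical bandwidth of 2 pixels, a small 50 × 50 pixel grid, a hypothetical peak threshold, and made-up click positions.

```python
import math

def kde_grid(clicks, size, bandwidth):
    """Sum a Gaussian kernel centred on each click over a size x size grid.

    The result is proportional to a Gaussian KDE (normalisation omitted).
    """
    grid = [[0.0] * size for _ in range(size)]
    for cx, cy in clicks:
        for y in range(size):
            for x in range(size):
                r2 = (x - cx) ** 2 + (y - cy) ** 2
                grid[y][x] += math.exp(-r2 / (2.0 * bandwidth ** 2))
    return grid

def local_maxima(grid, threshold):
    """Return (value, x, y) for pixels at or above threshold that are
    strictly higher than all of their (up to 8) neighbours, highest first."""
    size = len(grid)
    peaks = []
    for y in range(size):
        for x in range(size):
            v = grid[y][x]
            if v < threshold:
                continue
            neighbours = [grid[j][i]
                          for j in range(max(0, y - 1), min(size, y + 2))
                          for i in range(max(0, x - 1), min(size, x + 2))
                          if (i, j) != (x, y)]
            if all(v > n for n in neighbours):
                peaks.append((v, x, y))
    return sorted(peaks, reverse=True)

# Hypothetical clicks: five near pixel (20, 20) and three near (40, 10)
clicks = [(19, 20), (20, 20), (21, 20), (20, 21), (20, 19),
          (39, 10), (40, 10), (41, 10)]
peaks = local_maxima(kde_grid(clicks, size=50, bandwidth=2.0), threshold=1.0)
print(peaks[0][1:])  # (20, 20)
```

The highest peak, here at pixel (20, 20), would be taken as the consensus IR host position; the second peak marks a less popular candidate.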

There is currently no weighting for individual participants
in the Radio Galaxy Zoo processing. However, we
are implementing a “gold sample” set of 20 subjects presented
to all our participants for the purpose of weighting
the level of agreement between an individual participant’s
classification and that of a science team member. These “gold
sample” subjects are selected to have a range of morphologies
and classification difficulty, and are never removed from
the broader classification pool. The participants are unaware
of the exact subjects in the “gold sample”. Instead, a new
“gold sample” subject is shown to every participant at regular
intervals (interspersed with the randomly-selected images)
until the participant has completed the classification
of all 20 “gold sample” subjects. We will assemble the final
Radio Galaxy Zoo catalogue using Bayesian estimators similar
to those developed by Simpson et al. (2012) whereby
the individual participant’s classification of the “gold sample”
will be used as seed weights for the determination of
the final Bayesian classification, with the ground truth set
by the science team’s responses for the same subjects.

http://wise2.ipac.caltech.edu/docs/release/allsky/expsup/sec2.
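
To make the idea of a seed weight concrete, the sketch below assigns each participant the posterior-mean agreement rate with the science team on the gold-sample subjects under a Beta prior. This is purely illustrative and is not the Simpson et al. (2012) estimator; the function name, the uniform prior, and the example counts are all assumptions.

```python
def seed_weight(n_agree, n_gold, prior_agree=1.0, prior_disagree=1.0):
    """Posterior-mean agreement rate under a Beta(prior_agree, prior_disagree)
    prior, given n_agree agreements out of n_gold gold-sample subjects."""
    if not 0 <= n_agree <= n_gold:
        raise ValueError("n_agree must lie between 0 and n_gold")
    return (n_agree + prior_agree) / (n_gold + prior_agree + prior_disagree)

# Hypothetical participant who matched the science team on 18 of 20 subjects
print(seed_weight(18, 20))
# A participant with no gold-sample classifications yet starts at the prior mean
print(seed_weight(0, 0))  # 0.5
```

The prior keeps the weight of a participant with few gold-sample classifications close to 0.5 until more evidence accumulates.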

An overview of the reduced data for an example subject
is shown in Fig. 9. In this particular example, we find that
the cross-identifications made by the Radio Galaxy Zoo participants
and the experts are consistent. On the other hand,
simple nearest-galaxy-matching algorithms (e.g., McMahon
et al. 2002; Kimball & Ivezić 2008) would classify this subject
as consisting of two separate radio sources, corresponding
instead to the second most-common classification made
by our Radio Galaxy Zoo participants.


5 EARLY RESULTS 
5.1 WISE colours 

As an early test of the scientific returns of Radio Galaxy
Zoo, we analyse the infrared colours of the host galaxies
for the radio sources identified in the first twelve months
of operation. It should be noted that none of the ATLAS-SWIRE
subjects had been completed at this preliminary
stage of the project. From the 53,229 images with completed
classifications to date, we use the raw number of votes to
identify the number and association of the radio components
in the image. For those radio components, we use the result
from the KDE fitting to locate the position (in RA and
Dec) of the infrared counterpart, if users identified one. We
then match the list of positions to the WISE all-sky catalog
(Cutri et al. 2013). We matched 41,568 (78 per cent) of
our radio sources to a WISE source within a radius of 6".
The radius is based on the size of the WISE beam at 3.4 μm;
the sky density of sources out of the Galactic plane gives a
mean of 0.11 random WISE sources per search cone. The
majority of such spurious associations have no W2 and/or
W3 emission, and are thus excluded from further analysis.
The remaining IR counterparts identified by RGZ are either
low S/N peaks that do not pass the WISE threshold, or
cases where the volunteers identified the radio source as having
no apparent mid-IR counterpart.
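
The expected number of chance coincidences per search cone follows from the source density multiplied by the cone area. The sketch below shows this calculation together with a simple angular-separation test for the 6" matching radius. The density of 1.26 × 10^4 deg^-2 is an assumed value, chosen within the range quoted in Section 4.2, and the function names are ours.

```python
import math

def expected_chance_matches(density_per_deg2, radius_arcsec):
    """Mean number of unrelated sources inside a circular search cone."""
    radius_deg = radius_arcsec / 3600.0
    return density_per_deg2 * math.pi * radius_deg ** 2

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0

# Assumed density of 1.26e4 sources per square degree, 6 arcsec radius
print(round(expected_chance_matches(1.26e4, 6.0), 2))  # 0.11

# A hypothetical WISE source 4 arcsec north of a click position is a match
print(separation_arcsec(190.0, 38.0, 190.0, 38.0 + 4.0 / 3600.0) <= 6.0)  # True
```

Under this assumed density the calculation reproduces the quoted mean of about 0.11 random sources per cone.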

Of the Radio Galaxy Zoo sources with a WISE counterpart,
we further restrict our analysis to those in which a
clear identification has been made by limiting the sample to
images in which at least 75 per cent of the volunteers agreed
on the number and arrangement of the radio sources. This
threshold is similar to the cutoffs used for the clean samples
in Galaxy Zoo (Lintott et al. 2008) and Galaxy Zoo 2
(Willett et al. 2013), but weights the sample more heavily
toward single-component and/or compact sources at the
expense of images with extended or multi-lobe radio morphologies.
We visually inspected several hundred subjects
and found reasonable agreement with this cutoff. Therefore
our 75 per cent consensus sample with WISE matches consists
of 33,127 sources, or 62 per cent of the classified Radio
Galaxy Zoo sources to date.

Figure 9. Example of a processed RGZ subject (FIRSTJ124610.0+384838). Panel (a): 3' × 3' WISE 3.4 μm image. The FIRST 1.4
GHz emission is overlaid as white contours. Panel (b): 3' × 3' FIRST radio continuum image. Panel (c), left column: Kernel density
estimator (KDE) used to determine the location of the IR source as pinpointed by visual identification. Panel (c), right column: Final
consensus classifications, including both the FIRST radio emission components (contours) and the peak IR source or sources (stars).
The top row (of panel c) shows the number one consensus classification by Radio Galaxy Zoo volunteers; the middle row (of panel c)
shows the second-most common consensus among Radio Galaxy Zoo volunteers; and the bottom row (of panel c) shows the consensus of
the expert Radio Galaxy Zoo science team. Both volunteers and the science team agree on the classification for this galaxy, which is of
a double-lobed radio source with a single IR host at the centre. Nearest-position automated matching algorithms with a small matching
radius (e.g., the 30" used by Kimball & Ivezić 2008) would have split this image into two separate radio sources, corresponding to the
second-most common identification by the Radio Galaxy Zoo volunteers.

Since the following analysis focuses on the infrared
colour properties of galaxies, it requires a robust measurement
of the infrared flux in multiple bands. We restrict the
sample to those with profile S/N > 5 in W1, W2, and W3.
It should be noted that a S/N > 5 cut translates to the
WISE photometric quality class ‘A’ and the higher S/N detections
of class ‘B’ (as class ‘B’ is defined to have a S/N > 3;
Cutri et al. 2013). These comprise 100 per cent, 97 per
cent, and 3(3 per cent, respectively, of the Radio Galaxy Zoo 
counterparts. The final set of galaxies with robust RGZ iden¬ 
tifications and clear WISE detections in three bands has a 
total of 4,614 galaxies. 
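
The successive sample cuts can be summarised with a few lines of arithmetic. The counts below are those quoted in the text; the final fraction (three-band detections relative to the 75 per cent consensus sample) is derived here and is not a number stated by the authors.

```python
n_classified = 53229   # images with completed classifications
n_wise_match = 41568   # matched to a WISE source within 6 arcsec
n_consensus75 = 33127  # 75 per cent consensus sample with WISE matches
n_three_band = 4614    # S/N > 5 in W1, W2, and W3

def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

print(round(percent(n_wise_match, n_classified)))   # 78
print(round(percent(n_consensus75, n_classified)))  # 62
print(round(percent(n_three_band, n_consensus75)))  # 14
```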

To compare our radio-detected sample to infrared-detected
sources in general, we generated a sample of $2 \times 10^{6}$
points randomly selected from sources in the WISE All-Sky
Catalog located within the FIRST footprint in the northern
Galactic hemisphere (RA from 10 to 15 hr, Dec from 0° to 60°).
This sample is limited to the same S/N > 5 cuts as for the
Radio Galaxy Zoo sources, which is roughly 5 per cent of
the total WISE sample (footnote 4). This sample of $\approx 1 \times 10^{5}$ objects
is used as a comparison control sample.

In Fig. 10 we plot the matched WISE-RGZ sources in
the infrared colour-colour space, using profile-fitted magnitudes
in the W1, W2, and W3 bands where all WISE
magnitudes are in the Vega system. Fig. 10(a) shows the
4,614 WISE-RGZ sources from the 75 per cent consensus
sample as black contours and compares our results to
those from other recent studies. The underlying colourmap
shows randomly selected sub-sample sources from the WISE
All-Sky catalog, and the green solid points represent the
sample of 335 radio-detected galaxies cross-matched to WISE
host galaxies by Gürkan et al. (2014). The red dashed wedge
in Fig. 10(a) demarcates the infrared colour region occupied
by X-ray-bright AGN (Lacy et al. 2004; Mateos et al. 2012).
It should be noted that the overlap between the Gürkan
sample and the WISE-RGZ samples is approximately 2.3
per cent and does not significantly bias our conclusions.

4 http://wise2.ipac.caltech.edu/docs/release/prelim/expsup/sec2_2a.html

In the mid-IR bands covered by WISE, normal galaxies
are expected to primarily populate a narrow mid-infrared
colour band between 0.0 < (W1 − W2) < 0.7 and 0.5 <
(W2 − W3) < 4.0. Since the longer (W2 − W3) bands are
more sensitive to dust produced in star formation, spiral
galaxies typically have redder colours in the mid-IR than
ellipticals (Wright et al. 2010). Various classes of active
galaxies (including QSOs, Seyferts, and LINERs), as well
as dusty [U]LIRGs, have very red colours at longer bands,
(W2 − W3) > 2.0, and a broader range of colours than normal
galaxies at shorter bands (0.0 < (W1 − W2) < 2.5). The
distribution of colours for the all-sky WISE objects spans
the full range of templates for extragalactic objects shown
in Fig. 10(b), but the majority of bright objects at 12 μm
(W3) have colours consistent either with stars or starburst
galaxies/LINERs. Consistent with recent findings (Gürkan
et al. 2014), the mid-infrared colour-colour plot appears to
be a reasonable discriminator for many types of AGN (Lacy
et al. 2004; Stern et al. 2012; Mateos et al. 2012). It should
be noted that the requirement for a detection in the W3
band biases our results towards low-redshift radio galaxies,
as strong W3 emission from radio sources at high redshifts
is rare.

We find that the preliminary sample of WISE-RGZ objects
has a distinctly different distribution of mid-infrared
colours from the randomly-selected all-sky sample. There are
three primary loci. The first is at −0.2 < (W1 − W2) < 0.3,
0 < (W2 − W3) < 1, which includes approximately 10 per
cent of the Radio Galaxy Zoo sources. These colours are
consistent with elliptical galaxies, which have older stellar
populations and a lack of dust that results in relatively
blue (W2 − W3) colours. The second locus of Radio
Galaxy Zoo sources lies near 0.7 < (W1 − W2) < 1.5,
2.0 < (W2 − W3) < 3.5 (approximately 15 per cent of the
total), corresponding to infrared colours typically associated
with QSOs and Seyfert galaxies. The infrared colours are
based on a strong non-thermal component from the accretion
disk around the black hole. The third locus of Radio
Galaxy Zoo sources lies near 0.1 < (W1 − W2) < 0.5,
3.5 < (W2 − W3) < 4.8; these are the reddest colours in
(W2 − W3), most commonly associated with luminous infrared
galaxies (LIRGs). This is the largest concentration of
Radio Galaxy Zoo sources in colour-colour space, including
approximately 30 per cent of Radio Galaxy Zoo sources with
W3 measurements.
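
The three colour loci described above can be expressed as simple rectangular cuts. The sketch below is an illustration only: the box boundaries are taken from the text, while the function name, the locus labels, and the example Vega magnitudes are hypothetical.

```python
# (W1 - W2) range and (W2 - W3) range for each locus, as quoted in the text
LOCI = {
    "elliptical": ((-0.2, 0.3), (0.0, 1.0)),
    "QSO/Seyfert": ((0.7, 1.5), (2.0, 3.5)),
    "LIRG": ((0.1, 0.5), (3.5, 4.8)),
}

def colour_locus(w1, w2, w3):
    """Return the name of the locus containing the source, or 'other'."""
    c12 = w1 - w2
    c23 = w2 - w3
    for name, ((lo12, hi12), (lo23, hi23)) in LOCI.items():
        if lo12 < c12 < hi12 and lo23 < c23 < hi23:
            return name
    return "other"

# Hypothetical Vega magnitudes (W1, W2, W3)
print(colour_locus(15.0, 14.9, 14.4))  # elliptical
print(colour_locus(15.0, 14.0, 11.0))  # QSO/Seyfert
print(colour_locus(15.0, 14.7, 10.7))  # LIRG
print(colour_locus(15.0, 14.5, 13.0))  # other
```

Sources that fall outside all three boxes correspond to the remainder of the population, which is spread along the loci of normal and active galaxies.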

The remainder of the population of Radio Galaxy Zoo
sources are distributed along the loci of both normal and
active galaxies. This is largely due to the fact that a subset
of the Radio Galaxy Zoo sample consists of compact
radio sources where star formation is the dominant mechanism
for the observed radio emission. The lack of objects
at (W2 − W3) < 0 indicates Radio Galaxy Zoo is almost
entirely free of stellar contamination. There are also very
few WISE-RGZ galaxies at the reddest (W1 − W2) colours,
indicating a lack of [U]LIRGs or very highly obscured AGN.
This is consistent with results from Sajina et al. (2007), who
show that ULIRGs at z < 1 are primarily radio-quiet (although
there is a larger radio-loud sample at z > 2).


The radio-loud galaxies from the Gürkan et al. (2014)
sample agree with the clustering of QSO-like Radio Galaxy
Zoo sources with red (W1 − W2) colours, although the remainder
are distributed more evenly in (W2 − W3); their
galaxies do not show the same concentration of ellipticals,
and have almost no examples similar to LIRGs. Using the
‘AGN wedge’ defined by Lacy et al. (2004) and Mateos et al.
(2012) as an AGN diagnostic, Gürkan et al. (2014) find that
49 per cent of their galaxies satisfy the AGN criteria as calibrated
from a bright X-ray sample (Fig. 10a). This is a
powerful diagnostic for the presence of an AGN, as only 9
per cent of the WISE all-sky extragalactic sources have similar
colours. However, it is clearly not a complete sample, as
more than half of their radio-loud galaxies fall outside this
locus. The fraction of WISE-RGZ sources falling within the
‘AGN wedge’ is very similar, accounting for 40 per cent of
our sample. Analysis in future papers will probe the differences
between the samples, including the likely dependence
on radio luminosity from brighter radio galaxies.

The population of Radio Galaxy Zoo host galaxies that
have infrared colours consistent with massive elliptical hosts
agrees with previous observations at low redshift (e.g. Auriemma
et al. 1977; Dunlop et al. 2003). This is typically explained
as the result of the accretion of smaller neighbouring
galaxies, in which the resulting host is an elliptical galaxy
and the radio-loud jets are launched from the recently-fueled
central black hole. To date, four examples of spiral galaxies
hosting a double-lobed radio source have been discovered
(Morganti et al. 2011; Hota et al. 2011; Bagchi et al. 2014;
Mao et al. 2015) and Radio Galaxy Zoo has identified several
such new candidates. An optical follow-up of these candidates
will determine the morphology of these hosts and
the relative accuracy of IR colour as a proxy.

The distribution of sources in the elliptical region, however,
is significantly different for Radio Galaxy Zoo sources
vs. “normal” elliptical galaxies detected in the all-sky catalogue.
Fig. 11 shows the distribution of (W2 − W3) for both
populations. There is a clear peak for both all-sky sources
and Radio Galaxy Zoo hosts around (W2 − W3) = 0. However,
the Radio Galaxy Zoo hosts have a significant population
of galaxies with redder colours, out to (W2 − W3) ~ 1.5.
Such a result suggests that the Radio Galaxy Zoo host galaxies
may have enhanced dust masses over quiescent ellipticals,
which would contribute to redder mid-infrared colours. This
hypothesis is consistent with previous optical studies which
found that dust is prevalent in the cores of the host galaxies
of 3CR radio sources (e.g. Martel et al. 1999).

On the other hand, the emission from star-forming
galaxies is likely to contribute to the redder mid-infrared
colours, as approximately 16 per cent of the FIRST-derived
Radio Galaxy Zoo sample consists of compact radio sources.
However, we cannot distinguish galaxies with AGN-dominated
radio emission and on-going star formation from
galaxies in which the AGN radio emission is negligible.
The peak that we find redder than the elliptical
population may result from a combination of dusty ellipticals
and some star-forming spirals, as we have not attempted
to split these. Although Tadhunter et al. (2014) find similar
enhancement of dust masses for radio-loud galaxies at
0.05 < z < 0.7 based on Herschel data, a recent study by
Rees et al. (2015) finds no difference in IR colours between
radio-loud and radio-quiet elliptical host galaxies. The properties
of the Radio Galaxy Zoo elliptical population will be
fully explored in a follow-up paper.

Figure 10. Panel (a): WISE colour-colour diagram, showing ~ 10^5 sources from the WISE all-sky catalog (colourmap), 4,614 sources
from the 75 per cent Radio Galaxy Zoo catalogue (black contours), and powerful radio galaxies (green points) from Gürkan et al. (2014).
The wedge used to identify IR colours of X-ray-bright AGN from Mateos et al. (2012) is overplotted (red dashes). Only 10 per cent of
the WISE all-sky sources have colours in the X-ray bright AGN wedge; this is contrasted with 40 per cent of Radio Galaxy Zoo and 49
per cent of the Gürkan et al. (2014) radio galaxies. The remaining Radio Galaxy Zoo sources have WISE colours consistent with distinct
populations of elliptical galaxies and LIRGs, with smaller numbers of spiral galaxies and starbursts. Panel (b): WISE colour-colour
diagram showing the locations of various classes of astrophysical objects (adapted from Fig. 12 in Wright et al. 2010).


The association of radio-loud hosts with LIRGs (but
not ULIRGs) is also unusual, since only a small fraction
of LIRGs are associated with late-stage mergers (Stierwalt
et al. 2013). The radio-continuum properties of 46
LIRGs from the Great Observatories All-sky LIRG Survey
(GOALS) show that 45 per cent of galaxies with radio emission
have radio properties resembling pure AGN, rather than
starburst or starburst-AGN composites (Vardoulaki et al.
2015). We note that this result is based on a sample of 46
low-redshift (z < 0.088) LIRGs, a small fraction of the total
GOALS sample of 202 galaxies. Results from Radio Galaxy
Zoo, both by matching the hosts and measuring extended
radio morphology vs. compact sources, can better quantify
this trend as a function of redshift.


The clustering of radio-detected WISE counterparts in
all three loci (ellipticals, QSOs, and LIRGs) and their difference
from random all-sky WISE sources strongly implies
that Radio Galaxy Zoo classifiers are accurately matching
the radio lobes to their host galaxies. Spurious associations
would result in infrared colours which are more consistent
with stars or starburst galaxies. These early results (which
have not been subject to explicit user weighting or outlier
rejection) reinforce the ability of crowdsourced volunteers to
carry out tasks useful for astronomical research in a reliable
manner.

Figure 11. Distribution of (W2 − W3) infrared colours for
objects near the locus typically identified as elliptical galaxies
(where (W1 − W2) < 0.5). Solid and dashed vertical lines
show the median colours of the all-sky and RGZ sources. While
sources randomly selected from the WISE all-sky sample peak
near (W2 − W3) = 0, our current RGZ sample shows a large
population with significantly redder colours, possibly from
star-forming galaxies and/or ellipticals with enhanced dust.
Figure 12. An example of how some of the volunteers recognise that they might be looking at only a piece of a radio source, and then
use the provided links in RadioTalk to examine larger fields and other surveys. The three small insets labelled A, B, and C are 3' × 3' in
size, representing the Radio Galaxy Zoo images presented to the participants. The much larger field (11.5' × 11.5') shows that this is part
of a very large radio triple, with a 670" angular size from hot spot A to hot spot C. The background image is the WISE mid-infrared
image and the contours show the FIRST radio data with the contours starting at 3 times the local rms (0.14 mJy beam^-1) and increasing
by a factor of 2. The background colour scheme comes from CUBEHELIX (Green 2011).

5.2 New discoveries through RadioTalk


The most beneficial features of the RadioTalk online forum
are: (1) the links to larger fields and other complementary
surveys; and (2) the discussion board. Fig. 12 illustrates one
example of the power of RadioTalk. The Radio Galaxy Zoo
image size presented to the participants is 3' × 3' and three
such squares are shown in Fig. 12 (insets A-C). Using the tools
provided in RadioTalk, it became apparent to several Radio
Galaxy Zoo participants that the radio components observed
in these three subjects are part of the same radio source
extending 11.1' in angular size (large panel in Fig. 12). The
optical host galaxy is SDSS J123458.46+531851.3 and has a
photometric redshift of z = 0.62 ± 0.1. This source had been
found in an independent visual search for giant radio sources
in the NVSS (Andernach et al. 2012). The overall radio size
of 4.6 Mpc makes it the third-largest radio galaxy known
(Andernach, priv. comm. 2014). Given the presence of the
unrelated radio and infrared sources in this field, only a visual
inspection would allow the identification of this triple
radio source.
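
The conversion from the 11.1' angular size to a projected linear size can be sketched as follows. The cosmological parameters are not stated in this section, so the example assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3; both values, and the function names, are assumptions made for illustration.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Flat Lambda-CDM angular diameter distance via Simpson's rule.

    steps must be even; h0 and omega_m are assumed values.
    """
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving = (C_KM_S / h0) * total * h / 3.0
    return comoving / (1.0 + z)

def projected_size_mpc(angle_arcmin, z):
    """Projected linear size for a small angle at redshift z."""
    angle_rad = math.radians(angle_arcmin / 60.0)
    return angle_rad * angular_diameter_distance_mpc(z)

print(round(projected_size_mpc(11.1, 0.62), 1))  # 4.5
```

Under these assumed parameters the result is roughly 4.5 Mpc, in line with the quoted size of 4.6 Mpc; the small difference depends on the adopted cosmology and on the uncertainty in the photometric redshift.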


Even for radio sources much less extended than the one
presented in Fig. 12, automated algorithms based on: (1)
nearest position matching (e.g. McMahon et al. 2002; Kimball
& Ivezić 2008); or (2) a combination of position matching
with a specific search for double-lobes (e.g. van Velzen
et al. 2015) can be confused by the presence of multiple
discrete components typical of non-compact radio sources.
Fig. 13 shows an example of a radio source with a bent,
double-lobed morphology in a galaxy group at z = 0.073.
Apart from the radio emission from the core, an automated
algorithm will have difficulty in determining whether the discrete
components are lobes belonging to the core or if they
are independent sources. On the other hand, there is strong
agreement between the visual classifications by the RGZ volunteers
and the experts that all the visible radio components
are part of the same bent radio source structure hosted by
the galaxy SDSS J131904.16+293834.8.


The discovery from RadioTalk of a re-started jet in a 
WAT found within a few days of the Radio Galaxy Zoo 
launch was unexpected. We have since conducted follow-up 
spectroscopic observations to determine the redshift of the 
object, as well as deeper radio continuum observations with 
the VLA. 


In addition to unexpected discoveries, we also have ongoing
collaborations between the scientists and the Radio
Galaxy Zoo volunteers on various research topics. Typically,
the scientists communicate directly with the Radio
Galaxy Zoo volunteers by explaining their interest in a particular
object or phenomenon and then requesting help in collating
lists of possible candidates from the objects that have
been inspected. Currently, the projects being facilitated by
RadioTalk include: (1) the search for hybrid radio sources
where one radio source appears to have both FRI and FRII
characteristics (known as HyMoRS; Kapinska et al. 2015,
submitted to MNRAS); (2) the search for double-lobed radio
sources associated with spiral host galaxies (led by Mao);
and (3) the identification of giant radio galaxies (led by
Andernach).


Figure 13. An example of a galaxy where visual identification
of the radio components is necessary. The automated algorithms
would have classified the non-core emission as independent
sources, whereas the Radio Galaxy Zoo volunteers (in agreement
with the experts) find all five radio emission components in the
upper half of the image to be related to the same source.


6 SUMMARY 

Radio Galaxy Zoo is an online citizen science project operating
within the Zooniverse initiative where volunteers can
contribute towards current research projects. The primary
purpose of Radio Galaxy Zoo is to obtain host identifications
for radio sources from wide-field and eventually all-sky radio
surveys. In preparation for the next generation of all-sky radio
surveys, such as EMU which will yield 70 million sources,
we are also testing the viability of citizen science as an alternative
technique for inspecting such large datasets. In its
first and current incarnation launched publicly in December
2013, we are cross-matching the FIRST and ATLAS radio
surveys to mid-infrared images from the WISE and SWIRE
surveys.

By combining the work of more than 4,000 participants
in the first 12 months of operation, we have obtained more
than 30,000 host identifications from Radio Galaxy Zoo with
greater than 75 per cent consensus. By matching these to
nearby WISE detections, we find that the majority of our
current sample of radio sources reside in mid-infrared colour-colour
regions that are known to be occupied by elliptical
galaxies, QSOs, and LIRGs. This result is consistent with
canonical understanding whereby radio-loud sources are primarily
affiliated with elliptical galaxies and late-stage mergers.
We also find a significant population of Radio Galaxy
Zoo sources with redder mid-infrared colours than normal elliptical
galaxies. This is either IR emission from star-forming
galaxies or evidence of enhanced dust content. Further analysis
will examine how the association with the host depends
on radio morphology and power.

While we still have a significant population of sources
yet to be quantified (> 80 per cent), we do find that the
project participants are as effective as the science team at
identifying host galaxies for sources which are currently too
complex (due to a combination of structures and/or number
of source components) for simple position-matching automated
algorithms. In addition, the experienced participants
are also very successful at the identification of radio source
candidates which extend beyond the given 3' × 3' field. However,
it should be noted that there remains a significant number
of radio sources, at the 10-20 per cent level, which are too
complex to allow an unambiguous identification of the host
without further follow-up observations.

Additionally, through the collaborative efforts between
participants and the science team, we have discovered multiple
examples of unusual radio galaxies, including spiral
galaxies with extended double-lobed radio emission and new
HyMoRS.